DUAL-LOCO: Distributing Statistical Estimation Using Random Projections

Authors

  • Christina Heinze
  • Brian McWilliams
  • Nicolai Meinshausen
Abstract

We present Dual-Loco, a communication-efficient algorithm for distributed statistical estimation. Dual-Loco assumes that the data is distributed across workers according to the features rather than the samples. It requires only a single round of communication where low-dimensional random projections are used to approximate the dependencies between features available to different workers. We show that Dual-Loco has bounded approximation error which only depends weakly on the number of workers. We compare Dual-Loco against a state-of-the-art distributed optimization method on a variety of real-world datasets and show that it obtains better speedups while retaining good accuracy. In particular, Dual-Loco allows for fast cross validation as only part of the algorithm depends on the regularization parameter.
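
The feature-wise partitioning and single communication round described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function name `feature_partitioned_ridge`, the Gaussian projection, the choice to sum the other workers' projected blocks, and the use of a primal ridge solve on each worker are all assumptions made for illustration, and the loop runs serially rather than on separate machines.

```python
import numpy as np

def feature_partitioned_ridge(X, y, n_workers=4, proj_dim=20, lam=1.0, seed=0):
    """Illustrative sketch only (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assign features (columns) to workers at random.
    blocks = np.array_split(rng.permutation(p), n_workers)

    # Single communication round: each worker compresses its own block
    # with a random projection and shares only the n x proj_dim result.
    sketches = []
    for idx in blocks:
        Pi = rng.normal(size=(len(idx), proj_dim)) / np.sqrt(proj_dim)
        sketches.append(X[:, idx] @ Pi)
    total = sum(sketches)

    beta = np.zeros(p)
    for k, idx in enumerate(blocks):
        # Worker k sees its raw features plus a low-dimensional summary
        # of everyone else's features (assumption: summaries are summed).
        others = total - sketches[k]
        Z = np.hstack([X[:, idx], others])
        coef = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
        # Keep only the coefficients that belong to the raw features.
        beta[idx] = coef[: len(idx)]
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 40))
y = X @ rng.normal(size=40) + 0.1 * rng.normal(size=200)
beta_hat = feature_partitioned_ridge(X, y, n_workers=4, proj_dim=20, lam=1.0)
```

In this sketch the projected blocks do not depend on `lam`, so they could be computed once and reused while only the local solves are repeated for each candidate regularization value.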

Similar resources

LOCO: Distributing Ridge Regression with Random Projections

We propose LOCO, a distributed algorithm which solves large-scale ridge regression. LOCO randomly assigns variables to different processing units which do not communicate. Important dependencies between variables are preserved using random projections which are cheap to compute. We show that LOCO has bounded approximation error compared to the exact ridge regression solution in the fixed design...

Application of Clustering in the Non-Parametric Estimation of Distribution Density

Abstract. This paper discusses a multimodal density function estimation problem of a random vector. A comparative accuracy analysis of some popular non-parametric estimators is made by using the Monte-Carlo method. The paper demonstrates that the estimation quality increases significantly if the sample is clustered (i.e., the multimodal density function is approximated by a mixture of unimodal ...

Simulation uncertainty of complex economic system behavior

Managing property relations (ownership, disposal, use) over limited resources, which arise naturally in economic systems, faces the problem of uncertain behavior of the system's active elements. This work offers a model for identifying and forecasting the trajectory of an economic system's states over time, which allows a comprehensive assessment of its behavior from the standpoint of risk (additive distributing)...

Using Stable Random Projections

Abstract. Many tasks (e.g., clustering) in machine learning only require the l_α distances instead of the original data. For dimension reductions in the l_α norm (0 < α ≤ 2), the method of stable random projections can efficiently compute the l_α distances in massive datasets (e.g., the Web or massive data streams) in one pass of the data. The estimation task for stable random projections has been ...

Improving Random Projections Using Marginal Information

We present an improved version of random projections that takes advantage of marginal norms. Using a maximum likelihood estimator (MLE), margin-constrained random projections can improve estimation accuracy considerably. Theoretical properties of this estimator are analyzed in detail.

Publication date: 2016